Error Introduced by the Infinite-Sites Model

Authors

  • Alan Rogers
  • Henry Harpending
Abstract

In a recent paper, Henry Harpending and I (Rogers and Harpending 1992) interpret variation in human mitochondrial DNA by using a model by Li (1977). The model is unrealistic in several respects, one of which is its use of the "infinite-sites" model (Kimura 1969) of molecular evolution. This model assumes that no nucleotide (or restriction) site mutates more than once, an assumption that is clearly violated in human data (Kocher and Wilson 1991). The infinite-sites model may nonetheless be useful as an approximation, provided that the error it introduces is small. This note evaluates the relative error of this approximation.

Harpending and I make use of the infinite-sites assumption at only one point in our analysis: we assume that differences between a pair of individuals are introduced at a rate, $2u$ per generation, which is constant in time. With a finite number of sites, this assumption cannot hold exactly, because, after a nucleotide site has been struck once by mutation, later mutations need not add to the count of differences between our pair of individuals. The more sites there are with prior mutations, the lower will be the rate at which new differences accumulate. Thus, differences accumulate at a decreasing, rather than a constant, rate.

Let $D(t)$ denote the expected number of differences over a finite number, $K$, of nucleotide sites between a pair of lineages that have been separated for $t$ generations. Under the infinite-sites model, $D(t) = 2K\mu t$, where $\mu$ is the mutation rate per nucleotide site, and where $K \to \infty$ while $\mu \to 0$, such that $K\mu$ remains finite. Here, I assume instead that $K$ is finite and that mutation at the $i$th site follows a Poisson process with rate $\mu_i$ per generation. Thus, the number of mutations at site $i$ since time zero, counted along the path connecting the two lineages, is Poisson with mean $2\mu_i t$. In human data, transversions are rare. Thus, I assume that all mutations are transitions.

Consider the comparison, between two lineages, at a single site, say site $i$. Initially, at time 0, the two lineages will be identical, since they share a common ancestor then. The first mutation along the path connecting the two lineages will make them different at site $i$, but the second will restore their identity, because of the assumption that all mutations are transitions. Each successive mutation at site $i$ toggles the two lineages back and forth between the states of identity and nonidentity. The probability that the two lineages differ at site $i$ is thus equal to the probability that the number of prior mutations there is odd. This probability is $(1 - e^{-4\mu_i t})/2$ (Haldane 1919). The expected sum of differences over all $K$ sites is
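The comparison described above can be sketched numerically. The following Python sketch assumes that the expected number of differences under the finite-sites model is obtained by adding the per-site probability $(1 - e^{-4\mu_i t})/2$ over the $K$ sites (linearity of expectation), and that the infinite-sites value is $2t\sum_i \mu_i$, which reduces to $2K\mu t$ when all sites share one rate. The function names, the definition of relative error used here, and the example rates are illustrative assumptions, not values or definitions taken from the note.

```python
import math


def infinite_sites_differences(mu_rates, t):
    """Expected differences if every mutation adds a new difference.

    With a common per-site rate mu this equals 2 * K * mu * t.
    """
    return 2.0 * t * sum(mu_rates)


def finite_sites_differences(mu_rates, t):
    """Expected differences when repeated transitions at a site can cancel.

    Each site contributes P(odd number of mutations) = (1 - exp(-4*mu_i*t)) / 2.
    expm1 keeps precision when 4*mu_i*t is very small.
    """
    return sum(-math.expm1(-4.0 * mu * t) / 2.0 for mu in mu_rates)


def relative_error(mu_rates, t):
    """Assumed definition: (infinite - finite) / finite."""
    finite = finite_sites_differences(mu_rates, t)
    infinite = infinite_sites_differences(mu_rates, t)
    return (infinite - finite) / finite


if __name__ == "__main__":
    # Hypothetical example: 100 sites, each with rate 1e-6 per generation.
    rates = [1e-6] * 100
    for generations in (1_000, 10_000, 100_000):
        print(generations, relative_error(rates, generations))
```

Under these assumptions the finite-sites count saturates at $K/2$ for large $t$, whereas the infinite-sites count grows without bound, so the relative error increases with the separation time.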



Publication date: 1998